19 research outputs found

    Application of feature selection, learning metrics and dimension reduction in intrusion detection systems

    Computer networks were initially designed for a limited number of users; today they are a necessity for homes and for small, medium and large organizations. Poor network structure designs have opened security gaps that threaten the integrity, confidentiality and availability of the information transferred over them, so new strategies are needed to identify unauthorized access to computer networks. The purpose of this research is to apply feature selection techniques, learning metrics and dimension reduction to intrusion detection systems, using the data stored in the NSL-KDD dataset, which contains 225,000 connection records from a computer network with 41 features. Includes bibliography, appendix

    Dataset for estimation of obesity levels based on eating habits and physical condition in individuals from Colombia, Peru and Mexico

    This paper presents data for the estimation of obesity levels in individuals from Mexico, Peru and Colombia, based on their eating habits and physical condition. The data contains 17 attributes and 2111 records. The records are labeled with the class variable NObesity (Obesity Level), which allows classification of the data using the values Insufficient Weight, Normal Weight, Overweight Level I, Overweight Level II, Obesity Type I, Obesity Type II and Obesity Type III. 77% of the data was generated synthetically using the Weka tool and the SMOTE filter; 23% was collected directly from users through a web platform. This data can be used to build intelligent computational tools that identify the obesity level of an individual and recommender systems that monitor obesity levels. For discussion and more information on the dataset creation, please refer to the full-length article “Obesity Level Estimation Software based on Decision Trees” (De-La-Hoz-Correa et al., 2019). Universidad de la Cost
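
SMOTE creates synthetic minority-class records by interpolating between an existing record and one of its nearest neighbours of the same class. The following is a minimal sketch of that idea in Python, assuming purely numeric features and Euclidean distance. The function name `smote`, its parameters and the sample points are hypothetical; this is not the Weka filter used to build the dataset, which also handles nominal attributes.

```python
import math
import random


def smote(samples, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority-class
    sample and one of its k nearest minority-class neighbours."""
    if len(samples) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        # k nearest neighbours of base among the other samples (Euclidean distance)
        others = [s for s in samples if s is not base]
        neighbours = sorted(others, key=lambda s: math.dist(base, s))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical minority-class records with two numeric features each.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote(minority, n_new=6)
```

Each synthetic point lies on a segment between two real records, so it stays inside the region the minority class already occupies.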

    Deep learning of robust representations for multi-instance and multi-label image classification

    In multi-instance problems (MIL), an arbitrary number of instances is associated with a class label. Therefore, the labeling of training data becomes simpler (since it is done together, instead of individually) with the disadvantage that a weakly supervised database is produced [9]. In the PCRY, each restaurant is represented by a set of images that share the attribute label(s) of that establishment. This paper explores the use of previously learned attribute extractors, trained in 3 different databases that are similar and complementary to the PCRY databas

    Association rules implementation for affinity analysis between elements composing multimedia objects

    Multimedia objects are a constantly growing resource on the World Wide Web, which has created the need for methods and tools that extract new knowledge from the information analyzed. Association rules are a data mining technique whose purpose is to search for correlations between elements of a data collection, supporting decision making through the identification and analysis of those correlations, using algorithms such as Apriori, Frequent Pattern Growth, the QFP algorithm, CBA, CMAR and CPAR, among others. Multimedia applications, in turn, now require the processing of unstructured data provided by multimedia objects, which are made up of text, images, audio and video. For the storage, processing and management of multimedia objects, solutions have been developed that allow efficient search of data of interest to the end user, considering that the semantics of a multimedia object must be expressed by all the elements that compose it. This article analyzes the state of the art in the implementation of association rules for processing multimedia objects; the analysis of the consulted literature also raises questions about the possibility of creating an association rule method for the analysis of these objects. Universidad de la Costa, Universidad Pontificia Bolivariana
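
Association rules are usually scored by support (how often an itemset occurs) and confidence (how often the consequent appears when the antecedent does). The following is a minimal sketch in Python of those two measures and a brute-force search over single-item rules. The transactions, function names and thresholds are hypothetical illustrations, and the exhaustive search stands in for the pruning that algorithms such as Apriori perform.

```python
from itertools import combinations

# Hypothetical transactions: each set lists the element types found in one multimedia object.
transactions = [
    {"text", "image"},
    {"text", "image", "audio"},
    {"image", "video"},
    {"text", "image", "video"},
    {"audio", "video"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    joint = support(set(antecedent) | set(consequent), transactions)
    base = support(antecedent, transactions)
    return joint / base if base else 0.0


def pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Enumerate single-item rules A -> C that pass both thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, c in combinations(items, 2):
        for ante, cons in ((a, c), (c, a)):
            s = support({ante, cons}, transactions)
            conf = confidence({ante}, {cons}, transactions)
            if s >= min_support and conf >= min_confidence:
                rules.append((ante, cons, s, conf))
    return rules


rules = pair_rules(transactions)
```

With these toy transactions only the pair text/image is frequent and reliable enough to yield rules, in both directions.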

    Classification and features selection method for obesity level prediction

    Obesity has become one of the world’s largest health issues; rich and poor countries, without exception, have larger populations with this condition each year. According to the World Health Organization (WHO), obesity and overweight are defined as abnormal or excessive fat accumulation that may impair health, and obesity has nearly tripled since 1975. Data mining and its techniques have become a strong scientific field for analyzing huge data sources and providing new information about patterns and behaviors in the population. This study uses data mining techniques to build a model for obesity prediction, using a dataset based on a survey of college students in several countries. After cleaning and transformation of the data, a set of classification methods was implemented (Logistic Model Tree - LMT, RandomForest - RF, Multi-Layer Perceptron - MLP and Support Vector Machines - SVM), together with the feature selection methods InfoGain, GainRatio, Chi-Square and Relief; finally, cross-validation was performed for the training and testing processes. The data showed that LMT had the best performance in precision, obtaining 96.65%, compared to RandomForest (95.62%), MLP (94.41%) and SMO (83.89%), so this study shows that LMT can be used with confidence to analyze obesity and similar data
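
InfoGain, one of the feature selection methods named above, ranks each attribute by how much knowing its value reduces the entropy of the class label. The following is a minimal sketch in Python, assuming categorical features. The toy records and attribute names are hypothetical and are not taken from the study's dataset, and this is not the study's implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def info_gain(feature_values, labels):
    """Entropy of the labels minus their entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical survey records: two categorical attributes and a class label.
labels = ["obese", "obese", "normal", "normal"]
features = {
    "family_history": ["yes", "yes", "no", "no"],
    "gender": ["f", "m", "f", "m"],
}

# Rank attributes from most to least informative.
ranking = sorted(features, key=lambda name: info_gain(features[name], labels), reverse=True)
```

In this toy data `family_history` separates the classes perfectly while `gender` carries no information, so a selector keeping the top-ranked attribute would retain `family_history`.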

    State of the art of the project

    The aim of REMIND is to create an International and Intersectoral network to facilitate the exchange of staff to progress developments in reminding technologies for persons with dementia that can be deployed in smart environments. The consortium is comprised of an International network of 7 academic beneficiaries, 5 nonacademic beneficiaries and 4 partners from Third Countries, all of whom are committed to progressing the notion of reminding technologies within smart environments. The focus of REMIND is to develop staff and beneficiary/partner skills in the areas of user centered design and behavioral science coupled with improved computational techniques which in turn will offer more appropriate and efficacious reminding solutions. This will be further supported through research involving user centric studies into the use of reminding technologies and the theory of behaviour change to improve compliance of usage. Research objectives will be focused within the domain of smart environments. A smart environment can be viewed as having the ability to sense its surroundings through embedded sensors and following processing of the sensed information, adjust the environment through actuators to offer an improved experience for the inhabitant. Even though the availability, cost, size and battery life of sensing technology have all improved in recent years, the uptake of real smart environments has been limited. This is mainly related to the effort required to support the technical deployments and the lack of a business model to support a service provider capable of offering support to a large number of environments. In addition, there is a limit to the amount of scenarios which can be facilitated by such environments; this limit is directly related to the number of sensors availabl

    Feature selection, learning metrics and dimension reduction in training and classification processes in intrusion detection systems

    This research presents an IDS prototype in Matlab that assesses the network traffic connections contained in the NSL-KDD dataset, comparing the feature selection techniques available in the FEAST toolbox and refining prior results by applying the dimension reduction technique ISOMAP. The classification process used a supervised learning technique called Support Vector Machines (SVM). The comparative analysis of detection rates by attack category indicates that the combination MRMR+PCA+SVM (selection, reduction and classification techniques) obtained the most promising results, using just 5 of the 41 features available in the dataset. The results obtained were: 85.42% normal traffic, 80.77% DoS, 90.41% Probe, 91.78% U2R and 83.25% R2L

    Application of FEAST (Feature Selection Toolbox) in IDS (Intrusion Detection Systems)

    Security in computer networks has become a critical point for many organizations, but maintaining data integrity demands time and large economic investments. Consequently, several hardware and software solutions have been proposed, but these have sometimes proved inefficient at detecting attacks. This paper presents research results obtained by implementing algorithms from FEAST, a Matlab toolbox, with the purpose of selecting the method with the best precision for detecting different attacks using the fewest features. The NSL-KDD dataset was taken as reference. The Relief method obtained the best precision levels for attack detection: 86.20% (NORMAL), 85.71% (DOS), 88.42% (PROBE), 93.11% (U2R) and 90.07% (R2L), which makes it a promising technique for feature selection for intrusion detection in data networks

    Method based on data mining techniques for breast cancer recurrence analysis

    Cancer is a constantly evolving disease that affects a large number of people worldwide. Great research efforts have gone into developing tools based on data mining techniques that help detect or prevent breast cancer. According to the literature consulted, large volumes of data play a fundamental role, and a great variety of datasets oriented to the analysis of the disease have been generated; this research used the Breast Cancer dataset. The purpose of the research is to compare the J48, RandomForest, NaiveBayes, NaiveBayes Simple, SMO Poly-Kernel and SMO RBF-Kernel classification algorithms, integrated with the Simple K-Means clustering algorithm, in order to generate a model that successfully classifies patients as having recurring or non-recurring breast cancer after having previously undergone surgery for the treatment of the disease. The best results were obtained by SMO Poly-Kernel + Simple K-Means: 98.5% precision, 98.5% recall, 98.5% TP rate and 0.2% FP rate. The results suggest that intelligent computational tools based on data mining methods could be used to detect breast cancer recurrence in patients who have previously undergone surgery
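
The reported measures can all be derived from the counts of a binary confusion matrix. The following is a minimal sketch in Python; the function name and the counts are hypothetical, chosen only to illustrate the formulas, and do not reproduce the study's results. Recall and TP rate are the same quantity, which is consistent with the abstract reporting the same value for both.

```python
def binary_metrics(tp, fp, fn, tn):
    """Precision, recall (TP rate) and FP rate from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0  # identical to the TP rate
    fp_rate = fp / (fp + tn) if fp + tn else 0.0
    return {"precision": precision, "recall": recall, "fp_rate": fp_rate}


# Hypothetical counts for a "recurrence" vs "no recurrence" classifier.
metrics = binary_metrics(tp=40, fp=10, fn=10, tn=140)
```

Tools that report a single figure for a multi-class or two-class problem often average these per-class values, so a published number may be a weighted average rather than the value for one class.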

    RDF query and protocols language using for description and representation of web ontologies

    The purpose of this article is to present the metadata structure based on RDF (Resource Description Framework) and the way in which queries can be made using SPARQL (SPARQL Protocol and RDF Query Language) as a basis for searching the Semantic Web. It also describes what must be considered when building a web ontology and the tools that can help software developers write queries using SPARQL